Endogenous Versus Exogenous Shocks in Complex Networks: an Empirical Test Using Book Sale Ranking
We study the precursory and recovery signatures accompanying shocks in complex networks, which
we test on a unique database of the Amazon.com ranking of book sales. We find clear distinguishing
signatures classifying two types of sales peaks. Exogenous peaks occur abruptly and are followed 
by a power law relaxation, while endogenous peaks occur after a progressively accelerating power 
law growth followed by an approximately symmetrical power law relaxation which is slower than 
for exogenous peaks. These results are rationalized quantitatively by a simple model of epidemic 
propagation of interactions with long memory within a network of acquaintances. The observed 
relaxation of sales implies that the sales dynamics is dominated by cascades rather than by the 
direct effects of news or advertisements, indicating that the social network is close to critical. 
A fundamental question in the theory of out-of-
equilibrium systems is whether the response function
to external kicks can be related to spontaneous internal
fluctuations [1]. At equilibrium, this is solved by
the fluctuation-dissipation theorem connecting suscepti- 
bility and noise. In many complex systems, this ques- 
tion amounts to distinguishing between endogeneity and 
exogeneity and is important for understanding the rela- 
tive effects of self-organization versus external impacts. 
This is difficult in most physical systems because exter- 
nally imposed perturbations may lie outside the complex 
attractor which itself may exhibit bifurcations. There- 
fore, observable perturbations are often misclassified. For 
this reason, we have studied a non-physical system in 
which the dividing line between endogenous and exoge- 
nous shocks is clear in the hope that it will lead to insights 
about complex physical systems. We study the dynamics 
of commercial growth and its relaxation in the social sys- 
tem of interacting buyers, obtained from an Amazon.com 
database of book sales. We do see a characteristic dif- 
ference in behavior between endogenous and exogenous 
shocks. 
FIG. 1: Time evolution over a year and a half of the sales
per day of two books: Book A (bottom, left scale) is "Strong 
Women Stay Young" by Dr. M. Nelson and Book B (top, right 
scale) is "Heaven and Earth (Three Sisters Island Trilogy)" 
by N. Roberts. 
Every book that has sold at least one copy on the online
retailer Amazon is automatically assigned a sales rank. 
Typically, two (respectively ten) sales a day puts a title in 
the top 10,000 (respectively 1,000) sellers. The top 100 
(respectively 10) sell more than about 30 (respectively 
100) books per day through Amazon. Its American web- 
site, Amazon.com, updates the ranks of its top 10,000 
books every hour, according to a formula accounting for 
recent sales and the entire sales history of the book. Direct
sales are confidential data, but their statistical properties
can be reconstructed approximately by careful observations [2].
The complementary cumulative distribution $P(s)$ of sales $s$
can be approximated by a stationary power law
$P(s) = C/s^{\mu}$ with $\mu \approx 2$ in the range of sales
from a few books sold per day to a few hundred (see the figure
in [2]). We use this power law to transform book ranks
$r(s) = N P(s)$ into sales $s$ according to the formula
$s = (NC/r)^{1/\mu}$, where $N$ is the total number of books
used to normalize the distribution. Thus, a time series of
the rank $r$ of a given book as a function of time, sampled
at a given rate, can be transformed into a time series of
instantaneous sales flux through this conversion.
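The rank-to-sales conversion above can be illustrated with a minimal sketch. The function name `rank_to_sales` and the calibration constant `NC_ASSUMED` are hypothetical: the text does not give the value of the product $NC$, so it is fixed here by assuming that rank 10,000 corresponds to 2 sales per day, one of the rules of thumb quoted earlier.

```python
def rank_to_sales(rank, nc_product, mu=2.0):
    """Invert r = N*C / s**mu, i.e. s = (N*C / r)**(1/mu)."""
    if rank <= 0:
        raise ValueError("rank must be positive")
    return (nc_product / rank) ** (1.0 / mu)


# Hypothetical calibration (an assumption, not a value from the paper):
# choose N*C so that rank 10,000 maps to 2 sales per day.
NC_ASSUMED = 10_000 * 2.0 ** 2

# Convert a few ranks into estimated daily sales.
ranks = [10_000, 1_000, 100, 10]
sales = [rank_to_sales(r, NC_ASSUMED) for r in ranks]
```

With $\mu = 2$ and this calibration, ranks 1,000, 100 and 10 map to roughly 6, 20 and 63 sales per day, the same order of magnitude as the figures of 10, 30 and 100 quoted in the text.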
The time series of ranks of thousands of books have
been recorded every six hours by JungleScan
(http://www.junglescan.com). From the hundreds
of books with at least one year of recording, we have se- 
lected all that have reached the top 50 in Amazon sales 
rank. We qualify a peak in sales as a local maximum 
over a three-month time window which is at least k = 2.5 
times larger than the average of the time series over the 
three months. In addition, we require at
least 15 days of data after each peak and 4 days before.
Out of some 14,000 books on JungleScan in April 2004,
our algorithm detects about 1000 such peaks. Fig. 1
shows about 1.5 years of data for two books, which are
illustrative of the two classes found in this study. Book A 
jumped on June 5, 2002, from rank in the 2,000s to rank 
6 in less than 12 hours. On June 4, 2002, the New York 
Times published an article crediting the "groundbreak- 
ing research done by Dr. Miriam Nelson" and advising 
the female reader interested in having a youthful
postmenopausal body to buy the book and consult it directly
[3]. This case is the archetype of an "exogenous" shock.
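The peak criterion described in this paragraph (a local maximum over a three-month window that is at least $k = 2.5$ times the window average, with at least 4 days of data before and 15 days after) can be sketched as follows. This is only an illustration: the function name `find_peaks`, the choice of a window centered on the candidate point, and the handling of flat tops are assumptions, since the text does not specify them.

```python
def find_peaks(sales, samples_per_day=4, window_days=90, k=2.5,
               min_before_days=4, min_after_days=15):
    """Return indices of sales peaks in a regularly sampled series.

    Assumption: the three-month window is centered on the candidate
    sample and truncated at the ends of the series.
    """
    half = window_days * samples_per_day // 2
    before = min_before_days * samples_per_day
    after = min_after_days * samples_per_day
    peaks = []
    # Only consider samples with enough data before and after.
    for i in range(before, len(sales) - after):
        lo = max(0, i - half)
        hi = min(len(sales), i + half + 1)
        window = sales[lo:hi]
        if sales[i] < max(window):
            continue  # not the maximum of its window
        if peaks and peaks[-1] >= lo:
            continue  # flat top: keep only the first sample
        if sales[i] >= k * (sum(window) / len(window)):
            peaks.append(i)
    return peaks


# Synthetic example: a flat series (6-hour sampling) with one spike.
series = [1.0] * 400
series[200] = 50.0
detected = find_peaks(series)
```

On this synthetic series the only index returned is the position of the spike; a completely flat series yields no peaks because no sample exceeds 2.5 times the window average.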

In contrast, Book B culminated at the end of June 2002 
after a slow and continuous growth, with no such newspa- 
per article, followed by a similar almost symmetrical de- 
cay, the entire process taking about 4 months. The peak 
for Book B belongs to the class of endogenous shocks as 
described below. Qualitatively, such endogenous growth 
is well explained in Ref. [4] by taking the example of
the book "Divine Secrets of the Ya-Ya Sisterhood" by R. 
Wells, which became a bestseller two years after publica- 
tion, with no major advertising campaign. Following the 
reading of this originally small budget book, "Women be- 
gan forming Ya-Ya Sisterhood groups of their own [...]. 
The word about Ya-Ya was spreading [...] from reading
group to reading group, from Ya-Ya group to Ya-Ya
group" [4].

Such a social epidemic process can be captured by the
following simple model. The instantaneous sales flux of a
given book results from a combination of external forces,
such as news, advertisements and selling campaigns, and of
social influences in which each past reader may impregnate
other potential readers in her network of acquaintances
with the desire to buy the book. This impact of
a reader on other readers is not instantaneous, as people
react at a variety of time scales. This latency can be
described by a memory kernel $\phi(t - t_i)$ giving the
probability that a buy at time $t_i$ leads to another buy at a
later time $t$ by another person in direct contact with the
first buyer. An initial buyer (the "mother" buyer) who
notices the book (either from exogenous news or by chance)
may trigger buying by first-generation
"daughters," who themselves propagate the buying drive
to their own friends, who become second-generation buyers,
and so on. We describe the sum of all buys by a
conditional Poisson branching process with intensity

$$\lambda(t) = S(t) + \sum_{i \mid t_i < t} \mu_i \, \phi(t - t_i) , \qquad (1)$$

where $\mu_i$ is the number of potential buyers influenced by the
buyer $i$ who bought earlier at time $t_i$. $S(t)$ is the rate
of sales initiated spontaneously without influence from 
other previous buyers; it can be decomposed into the 
sum of a white-noise process with power law distribution 
representing small triggering factors (which contribute 
to the endogenous shocks) and a jump process (Dirac 
distributions) modeling massive media coverage and ad- 
vertisement campaigns for instance (exogenous shocks). 
We note that the distinction between endogenous and ex- 
ogenous is in general murky and cannot be decided with 
100% certainty for most books; the correct approach is 
probabilistic and relies on the analysis of an ensemble of 
cases, as presented below. 
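To make the branching picture concrete, the following minimal sketch simulates the cascade triggered by a single "mother" buy. Several ingredients are assumptions made only for illustration and are not specified in the text: the number of daughters of each buyer is drawn from a Poisson law of mean $n$, the waiting times follow a Pareto law with density proportional to $1/t^{1+\theta}$ above a hypothetical short-time cut-off `t_min`, and the names `poisson` and `simulate_cascade` are invented here.

```python
import math
import random


def poisson(rng, lam):
    """Draw a Poisson variate by Knuth's method (fine for small lam)."""
    limit = math.exp(-lam)
    count, prod = 0, 1.0
    while True:
        prod *= rng.random()
        if prod <= limit:
            return count
        count += 1


def simulate_cascade(n, theta, t_min=1.0, rng=None, max_events=100_000):
    """Return the buy times of all descendants of one buy at time 0."""
    rng = rng or random.Random()
    times = []
    frontier = [0.0]  # buyers whose daughters are still to be drawn
    while frontier and len(times) < max_events:
        t_parent = frontier.pop()
        for _ in range(poisson(rng, n)):
            # Pareto delay: density ~ 1/t**(1 + theta) for t >= t_min.
            delay = t_min * (1.0 - rng.random()) ** (-1.0 / theta)
            t_child = t_parent + delay
            times.append(t_child)
            frontier.append(t_child)
    return times


# Average cascade size for a subcritical branching ratio n = 0.5.
rng = random.Random(0)
sizes = [len(simulate_cascade(0.5, 0.3, rng=rng)) for _ in range(5000)]
mean_size = sum(sizes) / len(sizes)
```

For a subcritical branching ratio the mean number of descendants per initial buy is $n/(1-n)$, so `mean_size` should be close to 1 for $n = 0.5$; the cascades grow without bound on average as $n$ approaches 1.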

Taking the ensemble average of (1) gives the self-
consistent mean-field equation

$$s(t) = \langle \lambda(t) \rangle = S(t) + n \int_{-\infty}^{t} d\tau \, \phi(t - \tau) \, s(\tau) , \qquad (2)$$

where $n = \langle \mu \rangle$ is the average number of first-generation
buys triggered by any "mother" within her acquaintance
network (also called the branching ratio); it depends
on the network topology as well as on the social
behavior of influences. The Green function $K(t)$ of (2)
corresponding to $S(t) = \delta(t)$ (exogenous shock) is easily
obtained by taking its Laplace transform and corresponds
to the exogenous response function. We postulate that
the "bare propagator" is of the form $\phi(t - t_i) \sim 1/(t - t_i)^{1+\theta}$
with $0 < \theta < 1$, corresponding to a long-memory process
of the kind commonly observed in sales relaxations. Then,

$$s_{\rm exo}(t) = K(t) \sim 1/(t - t_c)^{1-\theta} , \qquad (3)$$

where $t_c$ is the time of the shock, for $t - t_c < t^* \propto 1/(1 - n)^{1/\theta}$,
and $K(t) \sim 1/(t - t_c)^{1+\theta}$ for
$t - t_c > t^*$ and $n < 1$. Expression (2) can then be written
$s(t) = \int_{-\infty}^{t} d\tau \, K(t - \tau) \, S(\tau)$. Close to the critical
point $n \approx 1$, the cascade of generations embodied in (1)
renormalizes the memory kernel $\phi(t - t_i)$ into a dressed
or renormalized memory kernel $K(t - t_i)$ [5], giving the
probability that a buy at time $t_i$ leads to another buy
by another person at a later time $t$ through any possible
generation lineage ($\int_0^{\infty} K(t) \, dt = n/(1 - n)$ is the average
number of buys triggered by one buy). Thus, if we
interpret the sharp peak of Book A observed in Fig. 1
as the impact on the social network of women created by
the extraordinarily favorable appraisal of the New York
Times, the decay of the sales flow that followed gives a
direct measure of its response function $K(t)$: we indeed find
a power law (not shown) as predicted by (3) with
$\theta = 0.3 \pm 0.1$. Such a power law dependence of the
relaxation rate of book sales on Amazon.com is the hallmark
of a long-memory process characterizing the dynamics of
influences within the complex social network. This law
is similar to the relaxation of seismic activity after an
earthquake, known as Omori's law [6], and has also been
found in the response rates of Internet users to an exogenous
shock, such as the publication of a URL in a newspaper
interview [7], and in the relaxation of volatility shocks in
the stock market [8].
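The passage from the bare kernel $\phi$ to the dressed kernel $K$ in (3) can be outlined with Laplace transforms. The lines below are a heuristic sketch under stated assumptions, not necessarily the authors' derivation: time is measured from the shock ($t_c = 0$), $\phi$ is normalized to unit integral, $K$ denotes the response excluding the initial impulse itself (so that $s = \delta + K$ when $S = \delta$, consistent with the integral $n/(1-n)$ quoted in the text), and $c > 0$ is a constant set by the short-time cut-off of the kernel.

```latex
\begin{align*}
\hat{s}(\beta) &= \hat{S}(\beta) + n\,\hat{\phi}(\beta)\,\hat{s}(\beta)
  \;\Longrightarrow\;
  \hat{s}(\beta) = \frac{\hat{S}(\beta)}{1 - n\,\hat{\phi}(\beta)} , \\
\hat{K}(\beta) &= \frac{n\,\hat{\phi}(\beta)}{1 - n\,\hat{\phi}(\beta)} ,
  \qquad
  \int_0^{\infty} K(t)\,dt = \hat{K}(0) = \frac{n}{1-n} , \\
\hat{\phi}(\beta) &\simeq 1 - c\,\beta^{\theta}
  \quad (\beta \to 0,\ 0 < \theta < 1)
  \;\Longrightarrow\;
  \hat{K}(\beta) \simeq \frac{n}{(1-n) + n\,c\,\beta^{\theta}} .
\end{align*}
```

The two terms in the last denominator balance at $\beta^* = [(1-n)/(n c)]^{1/\theta}$, which defines the crossover time $t^* = 1/\beta^* \propto 1/(1-n)^{1/\theta}$. For $\beta \gg \beta^*$ one has $\hat{K}(\beta) \sim \beta^{-\theta}$, corresponding to $K(t) \sim 1/t^{1-\theta}$ for $t \ll t^*$. For $\beta \ll \beta^*$ the expansion $\hat{K}(\beta) \simeq n/(1-n) - n^2 c\,\beta^{\theta}/(1-n)^2$ has the same singular term as $\hat{\phi}$, corresponding to $K(t) \sim 1/t^{1+\theta}$ for $t \gg t^*$.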

In the absence of strong external influences, a peak in 
book sales can occur spontaneously due to the interplay 
between a continuous stochastic flow of small external 
news and the amplifying impact of the epidemic cascade 
of social influences. We propose that this mechanism 
explains the sales time series of books such as Book B 
shown in Fig. 1. Technically, the problem amounts to
calculating the average sales trajectory before and after a
peak, conditioned on the existence of a peak. We use the
standard result for stochastic processes $X(t)$ with finite
variance and covariance that
$\langle X(t) \mid X(t = 0) = X_0 \rangle \propto \mathrm{Cov}(X(t), X_0)$.
Applying this result to $\lambda(t)$ defined in (1) gives
$\mathrm{Cov}(\lambda(t), \lambda_0) \propto \int_{-\infty}^{0} d\tau \, K(t - \tau) \, K(-\tau)$.
This expression gives the average growth of the sales before
such an "endogenous" peak and the relaxation after the
peak, which is proportional to

$$s_{\rm endo}(t) \sim 1/|t - t_c|^{1-2\theta} , \qquad (4)$$

for $K(t)$ given by (3) [5]. The prediction that the relaxation
following an exogenous shock (3) should be faster
(larger exponent $1 - \theta$) than for an endogenous
shock (with exponent $1 - 2\theta$) reflects the fact that an
endogenous shock results from a precursory process that
impregnates the network much more, over a longer time,
and thus has a longer-lived influence. The prediction (4)
is verified for Book B with good precision (not shown)
with the same $\theta = 0.3 \pm 0.1$.
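The exponent $1 - 2\theta$ in (4) can be checked by a short scaling argument. The sketch below assumes that the peak sits at $t_c = 0$ and that the dressed kernel is a pure power law $K(u) = u^{\theta - 1}$ for all $u > 0$ (that is, the regime $t \ll t^*$ close to criticality); $B$ denotes the Euler beta function.

```latex
\begin{align*}
\mathrm{Cov}\big(\lambda(t), \lambda_0\big)
  &\propto \int_{-\infty}^{0} d\tau\, K(t - \tau)\, K(-\tau)
   = \int_{0}^{\infty} dx\, (t + x)^{\theta - 1}\, x^{\theta - 1}
   \qquad (x = -\tau,\ t > 0) \\
  &= t^{2\theta - 1} \int_{0}^{\infty} du\, (1 + u)^{\theta - 1}\, u^{\theta - 1}
   \qquad (x = t\,u) \\
  &= B(\theta,\, 1 - 2\theta)\; \frac{1}{t^{\,1 - 2\theta}} ,
   \qquad 0 < \theta < \tfrac{1}{2} .
\end{align*}
```

The integral over $u$ converges only for $0 < \theta < 1/2$, a condition satisfied by the value $\theta \approx 0.3$ reported in the text. Because the covariance of a stationary process depends only on $|t|$, the same power law describes the growth before the peak, which is the origin of the approximate symmetry between precursor and relaxation.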

Among the thousand peaks in our database, many are 
followed by complicated trajectories, probably due to multiple
external influences. Nevertheless, we fitted the sales
time dependence after each of the thousand peaks by a
power law of the type (3) or (4) and selected for further
analysis those that give a correlation coefficient $R$ better
than 0.95. This leaves us with 138 peaks. This criterion
ensures that we extract clear response functions among 
the generally complex time series. Our purpose is not 
so much to show that the response function is a power 
law but that, if it is, then we can identify a second class of
so-called "endogenous" peaks with precursory and decay
rates that can be predicted. Our results below are not
changed qualitatively by changing this threshold on $R$ in
the range from 0.8 to close to 1. We also find the welcome
feature that the classification discussed below in terms of
exogenous versus endogenous peaks improves (fewer
misclassifications) as the threshold increases, supporting the
hypothesis that the response function of book sales to a
strong advertisement is of the form (3). Our goal is thus
to extract the pure cases and show that they are consistent
with the model, which has rigid predictions linking
the exponents of the precursory and relaxation behavior
of exogenous and endogenous cases. In turn, this should
allow the general formulation (2), with an arbitrary
time-dependent source term, to be used to describe general
situations.

FIG. 2: Histogram of the exponents $p$ of the fits of book sales
as a function of time by the power law $\sim 1/(t - t_c)^p$. The
horizontal axis is the value of the power law exponent $p$ for
aftershocks. The arrows indicate the values $p = 1 - \theta$ (expected
for exogenous peaks) and $p = 1 - 2\theta$ (for endogenous peaks),
with $\theta \approx 0.3$ (see text).

For each of the 138 selected peaks, we measure the
exponent $p$ characterizing the power law $1/(t - t_c)^p$
describing the relaxation of the sales after each peak. We
perform a least-squares fit from the peak up to a time
$t_{\rm end}$, where $t_{\rm end}$ is varied from 15 days to the first
minimum of the sales between 25 days and 6 months after the
peak. Once all fits are performed for these different time
windows, we select the window with the highest correlation
coefficient of the fit to the data (we also used other
criteria such as selecting the window with the smallest
exponent, without altering the results significantly). The
histogram of $p$-values shown in Fig. 2 exhibits two distinct
clusters, one with a median at $p \approx 0.75$ and the
other with a median at $p \approx 0.45$, compatible with the
predictions (3) and (4) and with the choice $\theta = 0.3 \pm 0.1$.
This classifies the first (respectively second) cluster as
exogenous (respectively endogenous). This histogram is
robust with respect to variations in our procedure, such
as different windows and peak thresholds.
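The window-selection procedure described above can be sketched as follows. This is a simplified illustration under assumptions not stated in the text: the fit is done by ordinary least squares in log-log coordinates, the upper limit of $t_{\rm end}$ is passed in by the caller rather than located at the first minimum of the sales, and the function names `fit_power_law` and `best_window_fit` are hypothetical.

```python
import math


def fit_power_law(days, sales):
    """Fit sales ~ 1/days**p by a least-squares line in log-log space.

    `days` are times elapsed since the peak and must be positive.
    Returns (p, R), with R the absolute correlation coefficient.
    """
    xs = [math.log(t) for t in days]
    ys = [math.log(s) for s in sales]
    count = len(xs)
    mean_x, mean_y = sum(xs) / count, sum(ys) / count
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    r = sxy / math.sqrt(sxx * syy)
    return -slope, abs(r)


def best_window_fit(days, sales, min_end=15.0, max_end=180.0):
    """Try every end time in [min_end, max_end]; keep the highest R."""
    best = None
    for end in range(len(days)):
        if not (min_end <= days[end] <= max_end):
            continue
        p, r = fit_power_law(days[:end + 1], sales[:end + 1])
        if best is None or r > best[1]:
            best = (p, r, days[end])
    return best


# Synthetic relaxation with exponent 0.7, sampled every 6 hours.
days = [0.25 * i for i in range(1, 401)]
sales = [100.0 * t ** -0.7 for t in days]
p_fit, r_fit, t_end = best_window_fit(days, sales)
```

On this noise-free synthetic decay the fitted exponent recovers the input value 0.7 and the correlation coefficient is essentially 1; real sales data would of course scatter around the fitted line.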

According to our model, the peaks belonging to the
cluster with high $p$ ($p \approx 0.7$) should be in the exogenous
class, and therefore should be reached by abrupt jumps
without detectable precursory growth. Conversely, the
peaks belonging to the cluster with $p \approx 0.4$ should be
in the endogenous class, and therefore should be associated
with a progressive power law precursory growth
$1/(t_c - t)^p$ with exponent $p = 1 - 2\theta$. To check this
prediction, the following algorithm categorizes the growth of
sales before each of the peaks according to its acceleration
pattern. We differentiate between peaks which have
an increase in sales in a four-day period by a factor of at
least $k_{\rm exo}$ prior to the peak and peaks that have an
increase in sales by a factor of less than $k_{\rm endo}$. We find that
the larger $k_{\rm exo}$ is, the larger is the exponent $p$ of the average
relaxation for books that have an increase in sales by
a factor of more than $k_{\rm exo}$. Conversely, the smaller $k_{\rm endo}$ is,
the smaller is the exponent $p$ of the average relaxation for
books that have an increase in sales by a factor of less than
$k_{\rm endo}$. These results confirm the predictions of the model.
Quantitatively, we first apply a stringent selection with
$k_{\rm exo} = 30$ and $k_{\rm endo} = 2$. This implies that peaks for which
the acceleration factor is between 2 and 30 are discarded
in order to get clean signals. Out of the 138 peaks, this
leaves us with 30 peaks. We then average the precursory
and relaxation behavior of sales for the class of peaks
classified as endogenous with $k_{\rm endo} = 2$ and as exogenous
with $k_{\rm exo} = 30$. Fig. 3 confirms nicely the existence of
two classes, with a symmetric precursory and relaxation
behavior for endogenous peaks, and with all three power
laws accounted for by a single value $\theta = 0.3 \pm 0.1$. The
two clusters remain significant for less restrictive $k_{\rm exo}$ and
$k_{\rm endo}$. While the theoretical predictions have been derived
for ensemble averages, we find that more than 80%
of the 138 peaks we have analyzed, with at least a
year of data and which reached the top 50 among all books,
also obey $1/(t - t_c)^{1-\theta}$ or $1/(t - t_c)^{1-2\theta}$ individually with
reasonable precision. Our finding that the classifications
of endogenous versus exogenous match so well on individual
books is extremely significant, with only 1 chance
in about $10^8$ that this result could be obtained by chance
only.
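The classification by precursory acceleration described in this paragraph can be sketched in a few lines. The precise definition of the acceleration factor is an assumption here (the ratio of the sales at the peak to the sales four days earlier), as is the function name `classify_peak`; the thresholds are the stringent values $k_{\rm exo} = 30$ and $k_{\rm endo} = 2$ quoted in the text.

```python
def classify_peak(sales, peak_index, samples_per_day=4, lookback_days=4,
                  k_exo=30.0, k_endo=2.0):
    """Label a peak from the growth of sales over the preceding days.

    Assumption: the acceleration factor is the ratio of the sales at
    the peak to the sales `lookback_days` earlier (must be positive).
    """
    start = peak_index - lookback_days * samples_per_day
    if start < 0:
        raise ValueError("not enough data before the peak")
    factor = sales[peak_index] / sales[start]
    if factor >= k_exo:
        return "exogenous"   # abrupt jump
    if factor < k_endo:
        return "endogenous"  # slow precursory growth
    return "discarded"       # ambiguous acceleration, not analyzed


# Three synthetic pre-peak histories (6-hour sampling, peak at index 16).
abrupt = [1.0] * 16 + [40.0]
gradual = [10.0 + 0.1 * i for i in range(17)]
ambiguous = [1.0] * 16 + [5.0]
labels = [classify_peak(s, 16) for s in (abrupt, gradual, ambiguous)]
```

Peaks whose acceleration factor falls between the two thresholds are set aside rather than forced into a class, mirroring the selection that reduces the 138 peaks to the 30 clean cases.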

The values of the exponents smaller than one (close
to $1 - \theta$ and $1 - 2\theta$) both for exogenous and endogenous
relaxations imply that the sales dynamics is dominated
by cascades involving high-order generations rather
than by interactions stopping after first-generation buy
triggering. Indeed, if buys were initiated mostly by the
direct effects of news or advertisements, and not much
by triggering cascades in the acquaintance network, the
cascade model predicts that we should then measure an
exponent $1 + \theta$ given by the "bare" memory kernel $\phi(t)$.
This implies that the average number $n$ of impregnated
buyers per initial buyer in the social epidemic model is
very close to the critical value 1, because the
renormalization from $\phi(t)$ to $K(t)$ given by (3) only
operates close to criticality, characterized by the occurrence
of large cascades of buys. This offers a new signature of
criticality in self-organized networks [9].

Extreme events in complex physical systems, particularly
those which seem to involve self-organized criticality,
are often viewed as having an endogenous source [10].

FIG. 3: Precursory ("foreshock") and relaxation ("aftershock")
of sales around peaks obtained after averaging over
books in each class classified according to the precursory
acceleration (see text).



Our work shows that the issue is not clear-cut as endoge- 
nous and exogenous shocks may lead to similar power law 
signatures, which can however be distinguished by a careful
classification. This offers new ideas for probing
self-organized critical systems, as done recently for the
Olami-Feder-Christensen sandpile model [11]. We also note that
the distinction between jammed states (constructed by
fast processes) [12] versus fragile states (formed by slow
and delicate accumulation of perturbations) of granular
media and other "soft-matter" systems is based in
part on the nature of their preparation and on their re- 
sponse to finite and short-lived perturbations versus in- 
finitesimal continuously repeated ones. Recognizing the 
importance of the nature of the perturbation as suggested 
here could provide new insights into the organization of
granular media and new experimental questions, such as 
new ways of analyzing the history. Similar considerations
apply to memory retrieval using hysteresis loops
in magnets [14]. More generally, physical systems with
many competing equilibria such as glasses and spin glasses
are known to betray their history-dependent organization
differently depending on whether they are subjected to
large non-local perturbations or to continuous slow forcing
[15]. Our classification of endogenous versus exogenous
shocks presented here should encourage researchers
to analyze complex physical systems similarly.